Statistical and perceptual properties of images and videos with applications
The visual brain is optimally designed to process images from the natural environment that we perceive. Describing the natural environment statistically helps in understanding how the brain encodes those images efficiently. The Natural Scene Statistics (NSS) of the luminance component of images is the basis of several univariate statistical models. Such models have been fundamental building blocks of many visual applications, ranging from the design of faithful image and video quality models to the development of perceptually optimized image enhancement techniques. Towards advancing this area, I studied the bivariate statistical properties of images and developed the first closed-form model that describes the correlation of spatially separated bandpass image samples. I found the model useful in tackling problems such as blindly assessing the quality of images and assessing 3D visual discomfort of stereo images. Given the success of NSS in tackling image processing problems, I used them as a tool to tackle the blind video quality assessment (VQA) problem. First, I constructed a video quality database, the LIVE Video Quality Challenge Database (LIVE-VQC). This database is the largest across several key dimensions: number of unique contents, distortions, devices, resolutions, and videographers. To collect the subjective scores, I constructed a new framework on Amazon Mechanical Turk, and a large number of subjects from across the globe participated in the study. These efforts resulted in a VQA database that serves as a strong benchmark for real-world videos. Next, I studied the spatio-temporal statistics of a wide variety of natural videos and created a space-time completely blind VQA model that deploys a directional temporal NSS model to predict quality. My newly created model outperforms all previous completely blind VQA models on LIVE-VQC.
Electrical and Computer Engineering
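The univariate NSS models referred to above are typically built from divisively normalized bandpass responses, i.e., mean-subtracted, contrast-normalized (MSCN) coefficients, which for natural images follow a roughly generalized-Gaussian distribution. A minimal sketch of that normalization, with an illustrative Gaussian window and stabilizing constant rather than values from the thesis:

```python
# Sketch of divisive normalization (MSCN coefficients), the basis of
# univariate NSS models; sigma and eps here are illustrative choices.
import numpy as np
from scipy.ndimage import gaussian_filter

def mscn(image, sigma=7/6, eps=1e-3):
    """Mean-subtracted, contrast-normalized coefficients of a float image."""
    mu = gaussian_filter(image, sigma)                  # local mean
    var = gaussian_filter(image ** 2, sigma) - mu ** 2  # local variance
    std = np.sqrt(np.maximum(var, 0.0))                 # local contrast
    return (image - mu) / (std + eps)

rng = np.random.default_rng(0)
img = rng.random((64, 64))   # stand-in for a grayscale image
coeffs = mscn(img)           # roughly zero-mean; heavy-tailed for real scenes
```

Univariate NSS models then fit a parametric distribution (e.g., a generalized Gaussian) to the histogram of these coefficients.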
Recommended from our members
A closed-form correlation model of oriented bandpass natural images beyond adjacent responses
Building natural scene statistical models is crucial for a large set of applications, ranging from the design of faithful image and video quality metrics to image enhancement techniques. Most predominant statistical models of natural images characterize univariate distributions of divisively normalized bandpass responses, or wavelet-like decompositions of them. Previous models focusing on these bandpass natural responses offer optimized solutions to numerous problems in image processing; however, these models have not focused on finding a closed-form quantitative model capturing the bivariate natural statistics. Toward filling this gap, Su et al. recently modeled spatially horizontally neighboring bandpass image responses on multiple scales; however, that work did not cover distant bandpass image responses with various spatial orientations. This work builds on Su et al.'s model and extends the closed-form correlation model to cover distant bandpass image responses, up to a distance of ten pixels, with multiple spatial orientations, encompassing all the discrete spatial angles at those distances on multiple scales.
Electrical and Computer Engineering
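The quantity modeled in closed form here is the correlation between bandpass responses at spatially separated sites. A hedged sketch of how such correlations can be measured empirically, using a Laplacian-of-Gaussian bandpass filter and a smoothed-noise image as illustrative stand-ins for the paper's actual filters and natural-image data:

```python
# Empirical correlation of bandpass responses vs. spatial offset; the LoG
# filter and synthetic image below are illustrative, not the paper's setup.
import numpy as np
from scipy.ndimage import gaussian_filter, gaussian_laplace

def shifted_correlation(band, dy, dx):
    """Pearson correlation of a bandpass map with a (dy, dx)-shifted copy."""
    h, w = band.shape
    a = band[max(dy, 0):h + min(dy, 0), max(dx, 0):w + min(dx, 0)]
    b = band[max(-dy, 0):h + min(-dy, 0), max(-dx, 0):w + min(-dx, 0)]
    a = a - a.mean()
    b = b - b.mean()
    return float((a * b).sum() / np.sqrt((a * a).sum() * (b * b).sum()))

rng = np.random.default_rng(1)
img = gaussian_filter(rng.standard_normal((128, 128)), 3.0)  # smooth "scene"
band = gaussian_laplace(img, 2.0)                            # bandpass map

# Correlation decays as the spatial separation grows; other (dy, dx) pairs
# give the oriented offsets the closed-form model covers.
curve = [shifted_correlation(band, 0, d) for d in range(1, 11)]
```

A closed-form model fits a parametric curve to these empirical correlations as a function of distance and orientation, instead of tabulating them.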
Quality Assessment of In-the-Wild Videos
Quality assessment of in-the-wild videos is a challenging problem because of
the absence of reference videos and shooting distortions. Knowledge of the
human visual system can help establish methods for objective quality assessment
of in-the-wild videos. In this work, we show that two prominent effects of the
human visual system, namely content-dependency and temporal-memory effects, can
be used for this purpose. We propose an objective no-reference video quality
assessment method by integrating both effects into a deep neural network. For
content-dependency, we extract features from a pre-trained image classification
neural network for its inherent content-aware property. For temporal-memory
effects, long-term dependencies, especially the temporal hysteresis, are
integrated into the network with a gated recurrent unit and a
subjectively-inspired temporal pooling layer. To validate the performance of
our method, experiments are conducted on three publicly available in-the-wild
video quality assessment databases: KoNViD-1k, CVD2014, and
LIVE-Qualcomm. Experimental results demonstrate that our proposed method
outperforms five state-of-the-art methods by a large margin, achieving
12.39%, 15.71%, 15.45%, and 18.09% overall performance improvements over the
second-best method VBLIINDS in terms of SROCC, KROCC, PLCC, and RMSE,
respectively. Moreover, the ablation study verifies the crucial role of both
the content-aware features and the modeling of temporal-memory effects. The
PyTorch implementation of our method is released at
https://github.com/lidq92/VSFA.
Comment: 9 pages, 7 figures, 4 tables. ACM Multimedia 2019 camera ready.
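The temporal-memory effect described above can be illustrated with a simplified hysteresis pooling rule: each frame's effective quality is pulled toward the worst score in its recent past, so a brief quality drop depresses the video-level score more than a plain frame average would. The window size and blend weight below are illustrative choices, not the paper's exact subjectively-inspired pooling layer:

```python
# Simplified temporal-hysteresis pooling: viewers remember recent poor
# quality, so each frame score is blended with the minimum over the
# preceding tau frames. tau and alpha are illustrative, not the paper's.
import numpy as np

def hysteresis_pool(frame_scores, tau=6, alpha=0.5):
    """Video-level quality from per-frame scores with a memory of bad frames."""
    q = np.asarray(frame_scores, dtype=float)
    pooled = np.array([
        alpha * q[t] + (1.0 - alpha) * q[max(0, t - tau):t + 1].min()
        for t in range(len(q))
    ])
    return float(pooled.mean())

steady = [4.0] * 20                            # uniformly good video
glitchy = [4.0] * 9 + [1.0, 1.0] + [4.0] * 9   # brief mid-stream drop
# The drop drags scores down for tau frames after it, so the pooled score
# of `glitchy` falls below its plain frame average.
```

In the full method, the per-frame scores fed into such a pooling step come from a GRU applied to content-aware features of each frame.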